26 research outputs found

    Ancient exapted transposable elements promote nuclear enrichment of human long noncoding RNAs

    The sequence domains underlying long noncoding RNA (lncRNA) activities, including their characteristic nuclear enrichment, remain largely unknown. It has been proposed that these domains can originate from neofunctionalized fragments of transposable elements (TEs), otherwise known as RIDLs (repeat insertion domains of lncRNA), although just a handful have been identified. It is challenging to distinguish functional RIDL instances against a numerous genomic background of neutrally evolving TEs. We here show evidence that a subset of TE types experience evolutionary selection in the context of lncRNA exons. Together these comprise an enrichment group of 5374 TE fragments in 3566 loci. Their host lncRNAs tend to be functionally validated and associated with disease. This RIDL group was used to explore the relationship between TEs and lncRNA subcellular localization. By using global localization data from 10 human cell lines, we uncover a dose-dependent relationship between nuclear/cytoplasmic distribution and evolutionarily conserved L2b, MIRb, and MIRc elements. This is observed in multiple cell types and is unaffected by confounders of transcript length or expression. Experimental validation with engineered transgenes shows that these TEs drive nuclear enrichment in a natural sequence context. Together these data reveal a role for TEs in regulating the subcellular localization of lncRNAs. This research was funded by the NCCR “RNA & Disease” funded by the Swiss National Science Foundation and by the Medical Faculty of the University and University Hospital of Bern.

    Discovery of Cancer Driver Long Noncoding RNAs across 1112 Tumour Genomes: New Candidates and Distinguishing Features

    Long noncoding RNAs (lncRNAs) represent a vast unexplored genetic space that may hold missing drivers of tumourigenesis, but few such "driver lncRNAs" are known. Until now, they have been discovered through changes in expression, leading to problems in distinguishing between causative roles and passenger effects. We here present a different approach for driver lncRNA discovery using mutational patterns in tumour DNA. Our pipeline, ExInAtor, identifies genes with an excess load of somatic single nucleotide variants (SNVs) across panels of tumour genomes. Heterogeneity in mutational signatures between cancer types and individuals is accounted for using a simple local trinucleotide background model, which yields high precision and low computational demands. We use ExInAtor to predict drivers from the GENCODE annotation across 1112 entire genomes from 23 cancer types. Using a stratified approach, we identify 15 high-confidence candidates: 9 novel and 6 known cancer-related genes, including MALAT1, NEAT1 and SAMMSON. Both known and novel driver lncRNAs are distinguished by elevated gene length, evolutionary conservation and expression. We have presented a first catalogue of mutated lncRNA genes driving cancer, which will grow and improve with the application of ExInAtor to future tumour genome projects.

    LnCompare: gene set feature analysis for human long non-coding RNAs.

    Interest in the biological roles of long noncoding RNAs (lncRNAs) has resulted in growing numbers of studies that produce large sets of candidate genes, for example, differentially expressed between two conditions. For sets of protein-coding genes, ontology and pathway analyses are powerful tools for generating new insights from statistical enrichment of gene features. Here we present the LnCompare web server, an equivalent resource for studying the properties of lncRNA gene sets. The Gene Set Feature Comparison mode tests for enrichment amongst a panel of quantitative and categorical features, spanning gene structure, evolutionary conservation, expression, subcellular localization, repetitive sequences and disease association. Moreover, in Similar Gene Identification mode, users may identify other lncRNAs by similarity across a defined range of features. Comprehensive results may be downloaded in tabular and graphical formats, in addition to the entire feature resource. LnCompare will empower researchers to extract useful hypotheses and candidates from lncRNA gene sets

    Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis.

    Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply conserved roles of lncRNAs in tumorigenesis.

    Analyses of non-coding somatic drivers in 2,658 cancer whole genomes.

    The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.

    Global Positioning System: Understanding Long Noncoding RNAs through Subcellular Localization.

    The localization of long noncoding RNAs (lncRNAs) within the cell is the primary determinant of their molecular functions. LncRNAs are often thought of as chromatin-restricted regulators of gene transcription and chromatin structure. However, a rich population of cytoplasmic lncRNAs has come to light, with diverse roles including translational regulation, signaling, and respiration. RNA maps of increasing resolution and scope are revealing a subcellular world of highly specific localization patterns and hint at sequence-based address codes specifying lncRNA fates. We propose a new framework for analyzing sequencing-based data, which suggests that numbers of cytoplasmic lncRNA molecules rival those in the nucleus. New techniques promise to create high-resolution, transcriptome-wide maps associated with all organelles of the mammalian cell. Given its intimate link to molecular roles, subcellular localization provides a means of unlocking the mystery of lncRNA functions

    Cytoplasmic long noncoding RNAs are frequently bound to and degraded at ribosomes in human cells.

    Recent footprinting studies have made the surprising observation that long noncoding RNAs (lncRNAs) physically interact with ribosomes. However, these findings remain controversial, and the overall proportion of cytoplasmic lncRNAs involved is unknown. Here we make a global, absolute estimate of the cytoplasmic and ribosome-associated population of stringently filtered lncRNAs in a human cell line using polysome profiling coupled to spike-in normalized microarray analysis. Fifty-four percent of expressed lncRNAs are detected in the cytoplasm. The majority of these (70%) have >50% of their cytoplasmic copies associated with polysomal fractions. These interactions are lost upon disruption of ribosomes by puromycin. Polysomal lncRNAs are distinguished by a number of 5' mRNA-like features, including capping and 5'UTR length. On the other hand, nonpolysomal "free cytoplasmic" lncRNAs have more conserved promoters and a wider range of expression across cell types. Exons of polysomal lncRNAs are depleted of endogenous retroviral insertions, suggesting a role for repetitive elements in lncRNA localization. Finally, we show that blocking of ribosomal elongation results in stabilization of many associated lncRNAs. Together these findings suggest that the ribosome is the default destination for the majority of cytoplasmic long noncoding RNAs and may play a role in their degradation. This work was supported by grants Ramón y Cajal RYC-2011-08851, Plan Nacional BIO2011-27220 to R.J., and a 2014 FPI-Severo Ochoa fellowship to J.C.
